Analysis of Nyström method with sequential ridge leverage scores

Authors

Daniele Calandriello

Alessandro Lazaric

Michal Valko

Abstract

Large-scale kernel ridge regression (KRR) is limited by the need to store a large kernel matrix Kt. To avoid storing the entire matrix Kt, Nyström methods subsample a subset of columns of the kernel matrix and efficiently find an approximate KRR solution on the reconstructed K̃t. The chosen subsampling distribution in turn affects the statistical and computational tradeoffs. For KRR problems, [16, 1] show that a sampling distribution proportional to the ridge leverage scores (RLSs) provides strong reconstruction guarantees for K̃t. While exact RLSs are as difficult to compute as a KRR solution, we may be able to approximate them well enough. In this paper, we study KRR problems in a sequential setting and introduce the INK-ESTIMATE algorithm, which incrementally computes estimates of the RLSs. INK-ESTIMATE maintains a small sketch of Kt that is used at each step to compute an intermediate estimate of the RLSs. First, our sketch update does not require access to previously seen columns, and therefore a single pass over the kernel matrix is sufficient. Second, the algorithm runs within a fixed, small space budget that depends only on the effective dimension of the kernel matrix. Finally, our sketch provides strong approximation guarantees on the distance ‖Kt − K̃t‖2 and on the statistical risk of the approximate KRR solution. Because all our guarantees hold at every intermediate step, these bounds are valid at any time.
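
To make the quantities in the abstract concrete, the following is a minimal batch sketch in Python with NumPy. It is not INK-ESTIMATE: it computes exact RLSs from the full kernel matrix, which is precisely the cost the paper aims to avoid, and then samples columns in proportion to them. The RBF kernel, the regularization value `gamma`, the helper names, the projection form K̃ = C W⁺ Cᵀ, and the synthetic data are all assumptions chosen for illustration, and the RLS definition used, τ_i = [K (K + γI)⁻¹]_ii, is the standard one rather than a formula quoted from this page.

```python
import numpy as np

def rbf_kernel(X, bandwidth=1.0):
    # Gaussian (RBF) kernel matrix; the kernel choice is an assumption.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * bandwidth ** 2))

def ridge_leverage_scores(K, gamma):
    # Exact RLSs: tau_i = [K (K + gamma I)^{-1}]_{ii}.
    # K and (K + gamma I)^{-1} commute, so solving the linear system
    # gives a matrix with the same diagonal.
    n = K.shape[0]
    tau = np.diag(np.linalg.solve(K + gamma * np.eye(n), K))
    # Clip round-off so the scores stay in [0, 1].
    return np.clip(tau, 0.0, 1.0)

def nystrom_rls(K, gamma, m, rng):
    # Draw m column indices with probability proportional to the RLSs,
    # keep the distinct ones, and project K onto the selected columns.
    tau = ridge_leverage_scores(K, gamma)
    p = tau / tau.sum()
    idx = np.unique(rng.choice(K.shape[0], size=m, replace=True, p=p))
    C = K[:, idx]                      # selected columns
    W = K[np.ix_(idx, idx)]            # block of K on the selected points
    W_pinv = np.linalg.pinv(W, rcond=1e-10, hermitian=True)
    K_tilde = C @ W_pinv @ C.T         # Nystrom reconstruction
    return K_tilde, idx, tau

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))          # synthetic data, illustration only
K = rbf_kernel(X)
gamma = 1.0

K_tilde, idx, tau = nystrom_rls(K, gamma, m=60, rng=rng)
d_eff = tau.sum()                      # effective dimension = sum of RLSs
err = np.linalg.norm(K - K_tilde, 2)   # spectral distance ||K - K_tilde||_2
print(f"effective dimension: {d_eff:.2f}")
print(f"columns kept: {len(idx)}, spectral error: {err:.4f}")
```

The sum of the RLSs is the effective dimension mentioned in the abstract, and `err` is the spectral distance ‖Kt − K̃t‖2 that the guarantees bound. A sequential method would replace `ridge_leverage_scores` with estimates computed from a small sketch instead of the full matrix.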

Similar resources

Recursive Sampling for the Nystrom Method

We give the first algorithm for kernel Nyström approximation that runs in linear time in the number of training points and is provably accurate for all kernel matrices, without dependence on regularity or incoherence conditions. The algorithm projects the kernel onto a set of s landmark points sampled by their ridge leverage scores, requiring just O(ns) kernel evaluations and O(ns) additional r...

Provably Useful Kernel Matrix Approximation in Linear Time

On Column Selection in Approximate Kernel Canonical Correlation Analysis

We study the problem of column selection in large-scale kernel canonical correlation analysis (KCCA) using the Nyström approximation, where one approximates two positive semi-definite kernel matrices using “landmark” points from the training set. When building low-rank kernel approximations in KCCA, previous work mostly samples the landmarks uniformly at random from the training set. We propose...

Ridge Regression and Provable Deterministic Ridge Leverage Score Sampling

Ridge leverage scores provide a balance between low-rank approximation and regularization, and are ubiquitous in randomized linear algebra and machine learning. Deterministic algorithms are also of interest in the moderately big data regime, because deterministic algorithms provide interpretability to the practitioner by having no failure probability and always returning the same results. We pr...

A Statistical Method for Sequential Images – Based Process Monitoring

Today, with the growth of technology, process monitoring using video and satellite sensors has expanded because of the rich and valuable information these sensors provide. Recently, some researchers have used sequential images for image defect detection because a single image is not sufficient for process monitoring. In this paper, by adding the time dimension to the image-based process monitorin...

Publication date: 2016
